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1 .   Introduction 

In  reliability  problems,  but  also  in  studies  of  logistics,  congestion 
systems  and  elsewhere,  it  is  common  to  encounter  collections  of  nominally 
similar  equipments  or  other  entities  that  generate  point  events  at  similar, 
but  not  identical,  rates.  The  questions  then  arise  as  to  whether  evidence 
for  differences  in  the  rates  can  be  elicited  from  event  rate  data  on  all 
members  of  such  a  collection,  and  how  the  data  can  be  well  utilized  to 
provide  strengthened  estimates  of  the  underlying  true  rates  of  the 
individual  equipments.   If  all  equipments  seem  to  have  about  the  same 
failure  rate  then  there  should  be  little  harm  in  calculating  a  simple  pooled 
rate  and  quoting  it  for  all  members,  while  if  evidence  of  considerable 
difference  between  members  is  present,  then  the  individual  rates  seem  most 
appropriate.   Some  form  of  compromise  will  be  worthwhile  for  intermediate 
cases.   The  following  general  setup  formalizes  situations  and  provides 
compromise  estimates  that  tend  to  pool  the  data. 

Consider  a  collection  of  J  equipments  or  other  units  that  independently 
generate  events  in  accordance  with  Poisson  processes  of  constant  rate  A.. 
Observations  of  these  processes  are  available:   for  unit  i,  s. (=0, 1 , 2. . . ) 
events  have  been  observed  over  an  exposure  time  interval  t. ,  i=1,2,...I.   To 
describe  the  possible  variability  between  rates,  characterize  A.  as  the 
independent  realization  of  a  random  variable  A  with  fixed  parametric  density 

function  g  (*;9),  where  8  is  a  generic  vector  parameter.   The  density  g  can 

A  A 

be  said  to  describe   a  superpopulation  of   rate  parameters,    sample  values   from 


which  have  been  bestowed  upon  the  units  of  interest.  The  first  objective  of 
the  analysis  will  be  to  utilize  all  available  data  to  estimate  the 
prevailing  superpopulation  parameters,  _9;  the  second  is  to  mobilize  the 
estimated  superpopulation  parameters,  possibly  by  Bayes'  formula  or  an 
alternative,  to  provide  suitably  pooled  or  shrunken  estimates  for  individual 
rates.   Both  point  and  interval  estimates  are  desirable.  Models  of  the 
above  type  are  called  parametric  empirical  Bayes  (PEB)  models;  see  Morris 
(1983)  for  a  review  with  various  references.   Our  present  approach 
emphasizes  superpopulation  specifications  that  lead  to  robust  estimates  in 
the  sense  that  the  possibility  of  widely  discrepant  rates  or  exponential 
parameters  is  automatically  dealt  with  by  the  superpopulation  form.   Such 
performance  can  be  called  discrepancy  tolerant;  it  resembles  in  various  ways 
the  behavior  of  modern  robust  location  estimation  and  regression  techniques, 
cf.  Mosteller  and  Tukey  (1977);  we  call  our  procedure  robust  parametric 
empirical  Bayes  (RPEB).  General  ideas  of  robust  Bayesian  analyses  have  been 
described  by  Berger  (1980,  1984);  Albert  (1979)  in  an  unpublished  study 
considers  the  Poisson  case.  The  simultaneous  estimation  of  Poisson  means 
has  been  considered  by  many  authors;  a  recent  high-level  account  is  by 
Johnstone  (1984),  who  provides  many  references.   See  also  Martz,  H.  F.  and 
Waller,  R.  A.  (1982),  which  describes  work  in  the  system  reliability  areas. 


The  model  described  is  simplistic  in  recognizing  just  two  sorts  of 
variability  in  point  event  data:   the  ordinary,  Poissonian  sampling 
variation  of  observations  around  a  given  X-value  ("within"  variations)  and 
the  variation  of  the  individual  X --values  around  a  fixed,  unknown  value 
("between"  variation).   Of  course,  many  elaborations  are  possible.   A 


natural  possibility  to  consider  is  that  rate  variation  is  controlled  in  part 
by  operational  factors  such  as  temperature,  vibration,  maintenance  frequency 
and  adequacy,  etc.,  describable  by  a  regression  model.   Another  possibility 
is  that  individual  rates  are  themselves  realizations  of  random  processes, 
possibly  with  the  addition  of  trends,  thus  requiring  representation  of  time- 
dependent  over-Poisson  variations;  see  Cox  and  Lewis  (1966)  and  McCullagh 
and  Nelder  (1983),  pp.  131-133.   The  present  paper  does  not  deal  with  these, 
but  extensions  are  in  progress. 

The  emphasis  of  this  paper  is  data- analytical .   Algorithms  are  first 
constructed  for  estimation  of  superpopulation  parameters;  confidence  regions 
associated  with  these  are  constructed  and  displayed  graphically.  The 
superpopulation  parameter  estimates  are  then  applied  to  compute  point  and 
associated  interval  estimates  of  individual  rate  parameters.  Much  of  this 
latter  process  is  carried  out  numerically  and  displayed  graphically  as  well. 
New  shortcut  and  computationally  economical  approximate  solutions  to  the 
above  problems  are  furnished  and  compared  to  complete  Bayes  solutions.  The 
procedures  are  applied  to  three  sets  of  reliability  data,  and  the  results 
are  discussed.   Despite  the  formal  probabilistic  underpinnings  described  for 
the  procedure,  it  seems  reasonable  to  apply  the  methods  in  an  exploratory 
fashion  to  probe  for  structure  in  data  sets.   This  process  has  been  briefly 
illustrated  for  one  example. 


2.      Some  Illustrative  Data  Sets 


Here  are  some  data  sets   that  serve  to  motivate  our  later  analyses 


2.1      Failure  rates   of  air-conditioning  equipment. 

A  classical  data  set  to  which  our  analysis  appears  applicable  is  that 
of  failures  of  air  conditioning  equipment  on  13  Boeing  720  aircraft;  these 
data  were     originally  provided  by   Proschan   (1963),   and  have  been  much 

studied.     We  consider   an  initial   analysis  that  takes  each  aircraft  to  have  a 

-1 
constant   individual  mean  time  to  failure,    X.    ( i  =  1  ,2, 3. . • 1=1 3)   and  an  i.i.d. 

exponential   time  to  failure.      The  data  can  be  summarized  in  terms  of   numbers 

of  failures  over  an  exposure  time;    see  Cox  and  Lewis   (1966);    for  further 

discussion  see  Cox  and  Snell    (1980). 

Note  that  actual   time-to-individual-failure  data  is  available  for   each 

individual   equipment.      An  initial   data  analysis   of  each  unit's  failure 

pattern  failed  to  reveal   substantial   trend  or   evidence  of   departure  from   an 

exponential    failure  law.      The  likelihood  function  for    X.,    an   individual 

exponential   law  parameter,    is  of   the  Poisson-gamma  form  with  s.    the 

sufficient  statistic,   so  the  data  is  presented  as  such  in  Section  3,    and 

provisionally  analyzed  to  elicit  between-A  .    variability.      The  columns  headed 

r.    in  the  following  tables   include   the  raw   quotient    (individual   mle)   rates 

r.=s./t..      The   cases    in  our   tables   are  ordered  by    increasing  raw  failure 
ill  j  & 

rate,    r .  . 
l 


Table    1.1 
Air-conditioner  Failures 


3  . 

t. 

r . 

1 

l 

i 

2 

0.623 

3.21 

9 

1.800 

5.00 

14 

1.832 

7.64 

15 

1.819 

8.25 

12 

1.297 

9.25 

6 

0.639 

9.39 

23 

2.201 

10.45 

29 

2.422 

11.97 

6 

0.493 

12.17 

16 

1.312 

1  2.20 

27 

2.074 

13.02 

24 

1.539 

15.59 

30 

1.788 

16.78 

Aircraft  No. 

i 
11 

9 

5 

4 
12 
10 

2 

3 

1 

13 

7 

8 
6 


A  maximum  likelihood  ratio  test    (Cox  and  Lewis,   pp.    235-236)    further 
indicates   that   the  individual   parameters   are  significantly  different   at 
about  a  2  percent  level.      Thus  these  data  can  be  expected  to  exhibit  some 
between-rate  variability,   as  measured  by  a  scale   (e.g.   standard  deviation) 
parameter   of  the  superpopulation. 

2.2.      Loss   of   feedwater   flow. 

Table  2  presents  a  set   of   data  referring  to  the  rates   of   loss   of 
feedwater  flow  for  a  collection  of  nuclear   power  generation  systems;    see 
Kaplan    (1983).      This   class   of    "initiating  events"    is   important    in  the 
probabilistic   risk   assessment    (PRA)    of   nuclear   plants.      It   has   been  treated 
in  an   empirical   Bayesian  fashion  by   Kaplan,   although  he   does   not   use  that 
terminology . 


Table   1.2 
LOSS    OF   FEEDWATER   FLOW 


System . 

3  . 

1 

h 

l 

3 

0 

8 

0.041 

19 

0 

2 

0.17 

1 

4 

15 

0.27 

7 

2 

5 

0.4 

18 

1 

2 

0.5 

8 

4 

4 

1.0 

16 

3 

3 

1  .0 

25 

1 

1 

1.0 

4 

10 

8 

1.3 

10 

4 

3 

■1.3 

15 

4 

3 

1.3 

5 

14 

6 

2.3 

13 

10 

4 

2.5 

27 

5 

2 

2.5 

20 

3 

1 

3.0 

2 

40 

12 

3.3 

9 

13 

4 

3.3 

26 

10 

3 

3.3 

12 

14 

4 

3.5 

14 

7 

2 

3.5 

24 

12 

3 

4.0 

28 

16 

4 

4.0 

29 

14 

3 

4.7 

21 

5 

1 

5.0 

30 

58 

11 

5.3 

17 

11 

2 

5.5 

22 

6 

1 

6.0 

6 

31 

5 

6.2 

11 

27 

4 

6.8 

23 

35 

5 

7.0 

2.3     Pump  Reliability  data  at   a  pressurized  water  reactor    (PWR)   nuclear 
power  plant. 

In  Table   1.3,    there  appears   a  small   set   of   data  representing  failures 
of   pumps   in  several   systems   of   the  nuclear   plant  Farley    1 .      The  apparent 
variation  in  failure  rates   has   several   possible  sources;    some  are  mentioned 
later.      These   data  may   be  found   in  an  EPRI   report    (1982). 


Table    1.3 
PUMP    FAILURES 


System .  s .  t .  r . 

J  1  1  i  i 


1 

5 

94.320 

2 

1 

15.720 

3 

5 

62.880 

4 

14 

125.760 

5 

3 

5.240 

6 

19 

31 .440 

7 

1 

1.048 

8 

1 

1.048 

9 

4 

2.096 

10 

22 

10.480 

5.3x10? 

6.4x10' 

8.0x1 0_ 
11.1x10~p 
57.3x10 
60.4x10_f 
95.4x10  d 
95. 4x1 0~: 

191x10 


3.   Specific  PEB  Models 

Consider  two  parametric  families   as   representations   of   an  assumed 
superpopulation  for  the  event   rates.      These   are    (i)    the   centered  and  scaled 
log<-Student   t,    which   includes   the  log-normal   when  the   degrees   of   freedom 
parameter,   n,    becomes  infinite;    and    (ii)    the  gamma. 

Form    (i),    the   log-Student,    is  of   potential    interest   in  probabilistic 
risk   analysis   of  nuclear   power  systems    (PRA),    where  the  log-normal   form   has 
long  been   used  to  characterize   variability   between  failure  rates   for   various 
equipments;    see  the  Reactor  Safety   Study    (1975),    and  Kaplan   (1983).      The 
log-Student   generalizes    the   log-normal    setup,    admitting  systematically 
heavier-than-normal/Gaussian  tails   and  so   allowing  for   a   greater-than- 
Gaussian   propensity   for   extreme  outliers   for   the  rates.      The   tail    behavior 
is  regulated   by  n,    the  Student   degrees   of   freedom   parameter.      We   do   not   here 
attempt   to  estimate  n   from   data,   but   treat    it   as   a  tuning  parameter,   much   in 


the  manner  of  the  tuning  constant,  c,  appearing  in  bi-weight  regression;  see 
Mosteller  and  Tukey  (1977).   Form  (ii) ,  the  gamma,  is  the  natural  conjugate 
prior  associated  with  the  Poisson  likelihood,  and  hence  yields  pleasant 
analytical  simplicity. 

Here  are  the  formal  descriptions  of  the  PEB  models  considered. 

Log-Students  Stage  1  (Sampling  individual  rates  from  the  superpopulation) : 


X .  -  exp(e. ) 


,       v            C(n) 
e  m   g  (z;M,x;n)  = . n  n 

~     5  z-y  .   1    (n+1)/2         U*U 

[i  *  t— )2  -  1 


C(n)  being  the  appropriate  normalizing  constant;  {e.,  1=1 ,2 1}  is  a 

sequence  of  independent  random  variables. 


Stage  2  (Observations  from  the  individual  rates) 


s.  | A."  Poisson  (A.t.  )  (3.2) 


z-y 
Apparently  e.  ~  <t>( ■],  the  normal  distribution,  as  n->  °°,  this  is  the  log 

normal  model  favored  by  many  PRA  analysts.   Note  that  in  general 


n 


Var 


2 


[e^  =■  Var[ln  A.]  -  (  -^r  ]t  ,   n>2 


(aw)  a 
Gamma;   Stage  1:   A   '  g  (w;a,B)  -  e     (  — rr— —  ) 


(3.3) 


Stage  2:  s.lA.  ~  Poisson  (A.t.).  (3.4) 

i ' -i  .i  i 

There  seems   to  be  no  fundamental  justification  for   either 
parametric  superpopulation  form.     Generally,    the  log-Student   is  appealing 
because  of   its  controllable  long  tails   and  the  ease  of   interpretation  of 
normal   variation,   while  the  gamma  has  mathematical   convenience  to  recommend 
it.      Neither  represents  truly  eccentric  behavior  such  as  multi-modality  or 
extensive  asymmetry  —  features   that   cannot  be  ruled  out   in  real   data.      See 
Laird   (1978)   and  Copas   (1984)   for  non-parametric  approaches   to  this   problem, 
and  Tukey   (1974)    for   an  exploratory,    totally  non-probabilistic  approach  to  a 
large  data  set   of   similar  structure. 


4.   Fitting  the  Superpopulation  Models:   Stage  1. 

Given  data  of  the  form  exhibited  in  Tables  1.1,  1.2,  and  1.3,  it  is 
possible  to  estimate  the  parameters  in  the  superpopulation  form  by  various 
methods.  We  examine  two:   simple  moment  matching  and  maximum  likelihood. 


4.1      Crude  moment  matching. 

From  the  Poisson  assumption  and  familiar  conditioning  arguments,   one 
can  obtain  these  formulas: 


E[s.  |X    ]    =  X.t.    =   Var[s.  |X    ], 


E[s.]    =   ECX.lt.;    Var[s.]    =   E{  Var[s  .  |  X  .  ]}    +   Var  {E[s.  |  X  .  ]} 


So,   unconditionally, 


E[s.]    =   E[X]t., 


Var[s.]    =»   E[X]t.    +   Var[X]t.2.  (4.2) 


Consequently   if  the  raw  rates  are  modelled  by  r.    =   s./t., 


E[r.]    =   E[X]  (4.3) 


1 

Var[r.]    =   Var[X]    +   E[X]   — -  (4.4) 

1 

which  suggests  that  crude  estimates  for  E[X]  and 


Var[X]  can  be  obtained  by  matching  moments 


10 


11 
E[A]   -  r,   Var[A]   =  s^  -  r(  -r-  £     t~1    )  (4.5) 


i-1 


Now  for  the  specific  forms  considered  we  know  that 


log-Normal:      E[A]    =  exp(p   +  -5-—)    ;        Var[A]    =   (E[A])2(eT     -   1)  (4.6) 


6  8 

Gamma:      E[A]    -  ■    ;  Var[A]    -  — 5-  (4.7) 

-a  a 


and  so  both  u  and  t2,  or  a  and  B,  can  be  assessed,  perhaps  inefficiently  but 
very  conveniently,  by  using  (4.5)  in  conjunction  with  (4.6)  or  (4.7).   Note 
that  E[A],  and  hence  Var[A],  is  not  finite  for  the  log-Student  model,  and 

hence  this  simplest  momentr-matching  procedure  is  inapplicable.  Under  the 
circumstance  that  s.  is  large,  accurate  moment  approximations  for  In  (r.)  = 

ln(s./t.)  are  obtainable  for  the  Student  superpopulation,  provided  the 

Student  parameter  n  is  large  enough  (i.e.  >  2) .   A  more  refined  iterative 
estimating  procedure  has  been  furnished  for  fitting  the  gamma  in  Hill,  et  al 
(1984),  but  the  above  formulas  are  extremely  simple  and  useful  for  quick 
appraisals,  and  handy  as  a  start  for  iterative  likelihood  maximization. 

4.2  Likelihood  Methods. 

It  is  anticipated  that  the  method  of  maximum  likelihood  will  provide 
results  superior  to  crude  moment  matching  at  the  expense  of  greater 


1  l 


computational  effort,  particularly  for  the  log-Student  form.   Here  are  the 
likelihoods,  and  comments  concerning  their  maximization. 

log^Student :  The  likelihood  contribution  of  observation  i  is,  up  to 
irrelevant  constants, 


Li  =  Li^u,T;  3i»ti»n) 


-  '  e-*(z)tl  CX(z)]3i  z    '*  (n„)/2  (..8) 

-  [1  *  l—f  -  ] 


with  A(z)  =  exp(z),  so  the  total  likelihood  is 


L(w,t;s,  t;n)  =   n  L . ( y, t; s,t ;n)  (4.9) 

i  =  1  l 


The  integration,  and  subsequent  maximization,  must  be  carried  out 
numerically.   Integration  has  been  performed  by  several  alternative  Gauss- 
Hermite  procedures.   The  first  begins  by  approximating  the  integral  by 
Laplace's  Method,  and  concludes  by  Gauss-Hermite  integration  of  a  correction 
term  remaining  after  the  Laplace  effect  is  removed;  see  Gaver  (1985)  for 
details;  for  short,  this  process  will  be  called  LGH.   The  second  is  a  direct 
Gauss-Hermite  procedure  adapted  from  Naylor  and  Smith  (1982);  we  are 
grateful  to  J.  C.  Naylor  for  furnishing  a  FORTRAN  program  that  has  served  as 
the  basis  for  this  aspect  of  our  work;  call  this  GH.   The  maximization  was 
accomplished  in  the  first  method  by  a  refined  grid  search,  and  in  the  second 
by  a  quasi-Newton  procedure  adapted  from  IMSL  SUBROUTINE  ZXMIN,  operating  on 
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the  log-likelihood  surface.   The  classical  EM  algorithm  discussed  by 
Dempster,  Laird  and  Rubin  (1977)  is  applicable  for  estimating  the 
superpopulation  parameters,  but  does  not  directly  produce  approximate 
confidence  regions,  as  obtained  below. 

Gamma:  The  likelihood  contribution  of  observation  i  can  be  derived  by 
integration,  and  is  the  negative  binomial  expression 


r(s  +B)       t.Si  aB 
L.(a,e;s.,t.)    -        r(fl)        J   VB  •  C«.10) 

i   i 


In  view  of  independence,  a  product  of  these  contributions  provides  the  total 
likelihood,  as  in  (4.9).   Maximization  of  the  log  likelihood  has  then  been 
carried  out  by  the  IMSL  procedure. 

The  numerical  results  obtained  from  applying  the  above  procedures  to 
the  three  illustrative  data  sets  are  given  in  Figs.  4.1,  4,2,  and  4.3.   In 
order  to  ease  the  comparison  of  the  log-Student  and  gamma  analyses,  we  have 
re-parameterized  the  gamma  in  terms  of  y  and  x,  the  latter  being  the 
parameters  of  a  log-normal.   Thus  the  \i   and  t  log-normal  values  that  match 
the  first  two  gamma  moments  are 


6 /a         

v   =  ln[ ],  r  =  /  ln(1+1/S)  ;  (4.11) 


•1  +1  /B 


these  expressions  have  been  used  to  parameterize  the  gamma-likelihood  for 
graphical  display. 
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4.3   Approximate  Confidence  Regions  for  Superpopulation  Parameters 

The  likelihood  ratio  test  procedure  has  been  used  to  define  an 
approximate  joint  confidence  region  for  \i   and  x  for  the  two  superpopulation 

model  families.   The  procedure  specifies  that  all  (u,  t)  values  such  that 

*  *  2  "    " 

-2[ln(L(  u,  t;  s,_t  ;n)/L(  u,  x;n) )]  ^  x ?(1^a)  ,  where  (u,t)  is  the  mle,  constitute 

an  approximate  (1-a)»100%  confidence  region.   The  regions  obtained  for  the 
three  sets  of  data  appear  on  the  figures.  The  somewhat  eccentric,  but 
unimodal,  shape  of  the  log-likelihood  surface  is  exhibited  by  the  confidence 
contour  plots;  a  bit  more  symmetry  can  in  principle  be  obtained  by  re- 
parameterizing  in  terms  of  In  t,  but  for  our  data  sets  the  effect  was  not 
dramatic.   The  confidence  contours  are  roughly  elliptical  with  the  ellipse 
semi-axes  nearly  parallel  to  the  y-i  axes;  an  analysis  based  on  the 
simplifying  assumption  that  u  and  t  are  independently  bivariate  normal  works 
reaonably  well  for  our  data.  The  ellipticity  tends  to  disappear  when  the 
data  set  is  small  and  contains  several  s.=0  values;  as  anticipated  the 
region  then  often  intersects  the  t=0  axis,  suggesting  that  the  data  are 
consistent  with  a  single  underlying  parameter  value:   A.=A,  i  =  1 , 2, . . . ,  I,  if 
the  intersection  is  pronounced. 

5.   Individualized  ("Shrunken"  or  "Pooled")  Estimates. 

If  the  true  values  of  u  and  t  (log-Student)  or  a  and  8  (gamma) 
superpopulations  were  available,  then  an  obvious  step  would  be  to  compute 
the  Bayes  posterior  of  e.  =  In  A . ,  or  of  A .  in  the  gamma  case,  given  the 

value  of  s . .   Then  any  point  or  interval  estimates  desired  could  be 
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computed.   Such  calculations  must  be  done  numerically  for  the  log-Student 
family,  but  are  eased  in  the  gamma  case  by  the  conjugate  prior  assumption. 
If  the  values  of  u  and  t  are  estimated  from  data,  as  suggested  here,  then 
approximate  superpopulations  can  be  derived  by  replacing  (m,t)  by  (m,t)  and 
calculating  the  corresponding  Bayes  estimates;  see  Deely  and  Lindley  (1981) 
for  discussion  of  this  empirical  Bayes  approach.   Recent  work  by  Morris 
(1983)  and  Hill  (1984)  suggests  refinements  to  the  simple  procedure 
described.   Use  of  the  approximate  individualized  superpopulations 
(approximate  Bayesian  posteriors)  then  leads  to  point  estimates  and 
intervals.  We  have  chosen  to  first  calculate  (i)  the  means  e  «E[e  |s.]f  of 

the  posterior  distributions  for  the  individual  unit  log  rates,  e.,    in  the 

illustrative  data  sets;  these  can  be  compared  with  ordinary  log  raw  rates, 

1  /2 
ln(s./t.);  (ii)  the  standard  deviations  c=[var[e  .  |s  .  ]    ,  of  the  posterior 

distributions,  (iii)   approximately  95%   upper  tolerance  limits  (or  95?  one- 
sided Bayes  credibility  intervals)  for  each  unit,  based  on  a  normal 
approximation:   e.(0.95)  =  e.  +  1.645c,  and  (iv)  upper  confidence  limits 
for  the  credibility  intervals  (iii),  that  recognize  sampling  variability  in 
u  and  t.   We  are  encouraged  to  believe  that  such  intervals  are  reasonable  by 
looking  at  plots  of  the  posterior  densities  of  the  e. ;  see  Figures  4.1,  4.2, 
and  4.3.   More  exact  calculations  are  possible  by  numerical  integration  of 
the  posterior.   Explicit  expressions  for  the  above  quantities  are  these: 


Log-Student :   The  approximate  conditional  means  and  second  moments  are  cases 
of 
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Ecfi  lsI]< 


L  (uiT;a.,t. ;n) 


k     -A(z)t. 


z      e 


[A(z)]; 


dz 


Z-M 


[1  ♦(  -^ r  '   ] 


(5.1) 


2      1       T(n+1)/2 


n 


again  integrated  by  Gauss-Hermite  quadrature.   The  normalized  integrand  of 

k 
(5.1),  exclusive  of  z  ,  is  the  approximate  Bayes  posterior  density  of  &., 

given  s . . 


Gamma:   The  mean  and  variance  of  the  approximate  gamma  posterior  have 
familiar  pleasant  explicit  forms: 


ECAjL|si:  = 


s.  +  6 


t.  +  a 

l 


(5.2) 


and 


s.  +  6 


VarE[A. |s.]  - 


(5.3) 


(t.  +  a) 


There  are  no  such  simple  expressions  for  e.  =  InA .  in  the  gamma  case,  but 

the  posterior  moments  have  been  computed  by  Gauss-Hermite  quadrature  applied 
to  the  log-transformed  gamma  density. 
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5. 1   Analytical  Approximations  to  the  Posterior 

Although  the  above  numerical  computations  are  feasible,  it  is  often 
useful  to  have  relatively  simple  and  explicit,  if  approximate,  formulas  for 
point  estimates  and  posterior  densities.   One  such  can  be  derived  for  the 
log-Student  model  by  writing  the  posterior  density  as 


,.   x    „  ~(1/2)Q(z)  ._  ... 

g  (z|s.)  =  K  e  (5.4) 


z-y(z)   _ 
and  then  approximating  Q(z)  by  a  quadratic  q(z)  =  (1/2)( -. — r  )  '  in  the 

manner  of  Laplace's  method,  c.f.  N.  G.  de  Bruijn  (1957);  for  applications  to 

Bayesian  statistics  see  Mosteller  and  Wallace  (1964),  and  Kadane  and  Tierney 

(1985).   Differentiation  shows  that  the  minimum  of  Q(z)  occurs  at  z  . -   e., 

where  e.  is  the  modal,  or  maximum  likelihood,  estimate  of  e.ls.,  and  e.  is 
l  _i ' _i       l 

the  posterior  mean. 


Log-Student :   The  derivative  of 


n+1  z-u  ? 

Q(z)  =  -ezt.  +  Siz  -  {—2~ )   lnt1  +  (  —*— )(1/n)]  (5.5) 


set  equal  to  zero  yields  an  estimating  equation  that  can  be  written  as 
follows: 


ci  "  p 
A.  =  eei  =  [s.  -  ( )  wn(e.)](1/t.),  (5.6) 
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where  the  weight 


n+1  1 

wn(e.)  =  (— — ) ~  ■  (5.7) 

e.  - ■  y  _ 

1  ♦  (  -^—  r    (1/n) 


Graphical  or  analytical  examination  reveals  the  possibility  of  two 

A 

solutions  of  Q'(z)  -  0,  one  very  near  \i   and  the  other  near  ln(s./t.), 
corresponding  to  a  bimodal  posterior.   Convenient  explicit  analytical 
criteria  for  biraodality  to  occur  are  not  available;  but  neither  our  present 
data  sets,  nor  many  others,  have  revealed  such  bimodality  when  the  posterior 

A 

was  evaluated  numerically  and  plotted.   Our  approach  is  to  replace  w  (e.)  by 

w     =  w   (ln(s./t.))   and  by  w     =  w   (ln(1/(3t.))   when  s.=0.      This  approximate 
nnii  nn  i  l 

weight  leads  to  unique  solutions  of  (5.6)  by  Newton-Raphson  iteration.   An 
interesting  and  interpretable  formula  is  obtained  after  one  iteration, 

A 

starting  with   e.(0)    =   ln(s./t.): 


A  «  ^ 


s.    ln(s./t.)    ■+    (m/t2)w  (ln(s./t.  )-u)(1/(t2))w 

i            i     l                         n  11                             n 

e    (1)« ~ — r-5 — —  =   ln(s./t.) .    (5.8) 

1                      s.         +        (1/(t)    )    w                   X      l  s.    +    (1/(t)2   w 

i                                       n  in 


This  expression  resembles  the  familiar  Bayes  normal-theory  formula  for 

combining  prior  mean  and  likelihood  to  obtain  a  linear  estimator  of  the 

posterior  mean.   The  difference  is  the  presence  of  the  weight  w  ,  the  effect 

n 

of  which  is  to  reduce  the  influence  of  the  mean  of  the  superpopulation 
(prior)  upon  tail-discrepant  observations.  Discrepant  observations  (w 
small)  are  not  heavily  shrunken  or  pooled  towards  the  estimated  center,  y, 
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while  observations  close  to  that  center  (w  large)  are  shrunken  in  that 

n 

direction.   Actually,  the  net  shrinkage  is  caused  by  the  factor 

(1/(t)2w  /[s.  +  ((x)2)w  ],  in  which  w  plays  an  important  but  not  exclusive 
n   1         n  n    J 

role:  the  (approximate)  variance  s.and  (t)2  are  also  significant  (note  that 
w  depends  upon  x  and  upon  ln(s./t.),  so  shrinkage  is  not  linear).  Notice 
that  as  n->°°,  and  the  log-Student  approaches  the  log-normal,  the  discrepancy- 
tolerant  effect  diminishes;  when  n=°°  there  is  only  one  solution  to  (5.5)  and 
skrinkage  becomes  linear  (provided  the  effect  of  observation  i  on  u  and  t  is 
small,  as  it  usually  is).  The  variance  of  the  posterior  can  be  assessed 
from  the  second  derivative  of  Q(z);  from  (5.5) 


Var[e.|s.]  =  a.2  =  — * ■  (5.9) 

e  ei  t.  +  (1/t  )w 
i         n 


This  formula  again  exhibits  the  behavior  of  the  variance  associated  with  the 
posterior  encountered  when  normal  likelihoods  are  combined  with  normal 
conjugate  priors,  except  that  wildly  tail-discrepant  observations  are 
substantially  downweighted:   automatically  in  such  cases  e.    =  ln(s./t.)  and 

JO  J  111 

o2  =  1/s.  as  is  essentially  correct  for  a  simple  mle.   In  other  words,  our 
11 

approximations  (5.8)  and  (5.9)  crudely  treat  ln(s./t.)  as  normal  with  a 
conditional  mean  that  is  Student  t  with  non-negligible  variance.   Such 
approximations  are  very  convenient,  and  lend  themselves  to  simulation 
appraisals  of  the  two-stage  estimation  procedure;  see  Gaver  (1985),  and  a 
summary  in  Section  8  of  this  paper. 

Gamma:   In  the  gamma  case,  it  can  be  seen  that 
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Q(z)  =  -a'eZ  +  B'z  (5.10) 


where  a'  =  t.  +  a,  8'  =  s.  +  8.   Now  differentiation  again  shows  that 


and 


s  +  B 

e£i  =  (B'/a)  =  -  (5.11) 

t.  +  a 

l 


i   B 


Naturally,  these  formulas  resemble  the  results  for  the  log-Student 
superpopulation,  but  contain  no  weight,  w  ,  to  reduce  shrinkage  effects  upon 
tail-discrepant  observations. 


6.   Confidence  Limits 


Since  the  estimates  of  posterior  means,  variances,  and  tolerance  limits 

are  functions  of  u  and  x,  it  is  desirable  to  place  confidence  limits  on 

-        2 

£.(m,t),  o.(m,t),  and  e.(u,t).   These  may  be  based  on  the  confidence 

contours  of  Figs.  4.1-4.3,  and  are  constructed  numerically.   We  have 

supplied  only  upper  95?  confidence  contours,  obtained  by  grid  search  over 

(m,t)  space  to  maximize  e.(u,x),  say,  under  the  condition  that  (u,x)  belongs 

to  the  appropriate  confidence  set;  these  limits  are  denoted  by  e, . 


7.   Analysis  of  Data  Sets 
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The  estimation  procedures  described  have  been  applied  to  the  data  sets 
of  Tables  1.1-1.3.   Complete  tabulations  are  available  from  the  authors; 
here  we  examine  only  those  log  parameter  estimates  that  are  at  the  low  and 
high  extremes  for  each  data  set,  and  the  middle  or  median  level.   Ordering 
of  the  rates  is  in  terms  of  log  raw  rates.   It  is  anticipated  that  the  point 
estimates  of  the  extreme  individual  rates  will  exhibit  the  greatest 
variation  across  estimation  procedures  (superpopulation  models) ,  while  the 
middle  values  will  be  roughly  in  agreement.   Owing  to  the  partial  pooling 
effect,  the  middle  values  will  tend  to  exhibit  somewhat  smaller  posterior 
variation  than  the  extremes.   The  normal  superpopulation  model  tends  to 
shrink  more  extensively  than  the  other  superpopulation  models.   By  and 
large,  these  effects  are  observed.   For  a  visual  notion  of  the  posterior 
densities  from  our  data  sets  see  Figs.  7.1-7.  . 

7.1   Table  Notation 

The  following  notation  has  been  used  in  the  table  headings:   under 
Estimates, 

(1)  e(r)  =  ln(s./t.)  =  ln(r.) 

(2)  e(1,n)  =  linearized  posterior  mode,  Student  (n)  prior;  (n=5)  here 

(3)  e(n)  =  posterior  mean,  Student  (n)  prior 
(*0  e(g)  =  posterior  mean,  Gamma  prior 

(5)   e(°°)  =  posterior  mean,  normal/Gauss  prior. 

The  numbers  in  parentheses  under  each  of  the  above  are  the  standard 
deviations  of  the  associated  posteriors;  posterior  means  and  standard 
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deviations  are  either  computed  by  numerical  integration,  in  cases  (3),  (5), 
or  by  simple  explicit  approximate  formulas  in  cases  (1),  (2),  and  (4). 
Under  Intervals,  there  are  included  approximate  upper  95%  Bayes  credibility 
intervals  based  on  a  normal  approximation  (mean  +  ( 1 .645) (standard 
deviation)),  these  are 

(6)  e(r)  =  e(r)  +  1.645  o(r)   =   e. (r)+1 .645/  1 /s. ,  using  1/s.,  the 
simpliest  delta-method  approximation  to  var[ln(s. /t. )] ,  while  in 
[   ]  we  quote  the  upper  limit  computed  making  use  of  the  chi- 
squared  distribution  of  the  time  to  accumulate  s  failures,  an 
approximation  to  the  former; 

(7)  e(1,n):   same  as  preceding  using  Student  (n)  prior,  (n=5),  and 
linearized  estimates,  see  (5.8)  and  (5.9); 

(8)  e(n)  :  same  as  (6)  but  using  (3),  and  associated  standard 
deviation; 

(9)  e(n)  upper  95%  confidence  limit  on  e(g),  as  described  in  Section 
6; 

(10)  e(g)  :  same  as  (6),  using  moments  of  log-gamma  computed 
numerically; 

(11)  e(g)  :  upper  95%  confidence  limit  on  e(g),  as  in  Section  6; 

(12)  e(°°)  :  same  as  (8),  using  estimated  normal  prior; 

(13)  e(°°)  :  upper  95%  confidence  limit  on  e(g),  as  in  Section  6. 
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7.2  Air^-conditioner  Data 

Upward  shrinkage  of  the  smallest  estimate,  e(r),  is  most  pronounced  for 
£(°°)  ,  the  normal  prior,  less  so  for  the  gamma,  and  still  less  so  for  the 
Student  (5)'s;  the  simple  linear  approximation  least  so.   The  linearized 
Student  (5)  procedure  gives  a  small  weight  to  the  smallest  observed  rate. 

Upper  interval  boundaries  differ  less  than  point  estimates,  with  the  e 

MP 

levels  only  slightly  above  e. 

The  middle  estimate  is  shrunk  not  at  all  numerically  by  any  of  the 
point  estimates,  but  the  standard  deviations  of  all  shrunken/pooled 
estimates  are  about  70$  of  that  of  the  raw  estimate  e(r).   Upper  interval 
levels  are  correspondingly  reduced. 

The  largest  estimate  is  shrunk  downwards  slightly  and  consistently  by 
all  estimates,  shrinkage  is  less  extensive  for  the  largest  than  the 
smallest;  this  can  be  partly  explained  by  the  weights:   0.52  vs  0.13- 

7.3  Feedwater  Flow  Data 

The  smallest  observation  is  a  zero,  and  the  crudely  imputed  rate  is 
e(r)  -  ln(1/3t.);  it  is  enclosed  in  parentheses  to  signify  its  arbitrary 
nature.   Here  all  point  estimates  provide  some  upward  shrinkage:   the  normal 
prior  estimate,  e(°°)  ,  shrinking  upwards  the  most  extensively;  it  also 
exhibits  the  smallest  standard  deviation.   Here  the  Student  (5)  credibility 
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and  confidence  intervals  tend  to  be  lower  than  the  corresponding  gamma  and 
normal  intervals. 

The  first  middle  estimate,  (i)=15  in  rank,  is  shrunk  downwards  by 
perhaps  10%;    the  most  extensive  shrinkage  occurs  for  the  normal  model,  e(°°)  . 
Its  shrunken  standard  deviation  is  about  80?  of  that  of  the  raw  for  the 
Student  model.   Note  that  this  observation  involves  s=3  events  over  exposure 
time  t=1 ,  a  short  history.   By  contrast,  its  neighboring  middle  value  (i)=l6 
in  rank,  with  the  more  extensive  history  s-1 3  over  t«4,  exhibits  one-half 
the  shrinkage  and  very  little  standard  deviation  decrease. 

The  largest  estimate  is  shrunken  nearly  the  same  by  all  estimates;  the 
upper  intervals  agree  well  internally,  tending  to  be  slightly  below  the 
interval  raw  rate  interval,  e(r). 

7.4   Farley  Pump  Data 

A 

The  smallest  observation,  e(r)  in  this  data  set  is  shrunk  towards  the 
mean  to  essentially  the  same  degree  by  each  alternative  point  estimate; 
slightly  less  shrinkage  occurs  for  the  gamma  and  Student  (5)  models.   The 
upper  95?  credibility  limits,  e,  also  agree  for  all  models,  with  e(1,5) 

being  marginally  the  greatest.   The  upper  confidence  limits,  e,  are  in 
agreement  as  well. 

The  two  median  values  at  (i)=5,6  are  individually  treated  consistently 
so  far  as  point  estimates  go:   all  shrunken  estimates  reduce  the  log  raw 


24 


rate  towards  the  mean,  with  the  greater  shrinkage  occurring  for  (i)=5 
because  of  its  smaller  experience  (failure  count  and  exposure  time).   For 
the  same  reason,  posterior  standard  deviations  for  (i)=5  are  more  than  twice 
as  great  as  those  for  (i)=6,  and  upper  95%   credibility  intervals  and 
confidence  limits  reflect  this  difference  as  well. 

The  point  estimates  associated  with  the  largest  log  raw  rate  all 
substantially  agree  in  their  modest  downward  shrinkage,  and  the  upper 
credibility  and  confidence  limits.   Again,  the  close  agreement  is 
attributable  to  the  relatively  extensive  experience  embodied  in  unit  ( i ) =1 0 . 

It  is,  however,  worth  notice  that  the  estimated  scale  parameter,  x,  is 
quite  large  for  this  data  set.   A  plausible  reason  is  that  the  units  in  the 
set  are  not  truly  homogeneous,  and  that  much  of  the  large  variability  is 
explainable  by  classification  or  regression.   Our  estimation  procedures  tend 
to  reflect  this:   although  weights  w  are  rather  similar  for  extreme  and 
middle  observations,  the  actual  shrinkages  are  small  even  when  there  is 
little  experience  (e.g.  for  (i)=5).   In  fact,  investigation  reveals  that  the 
four  pumps  with  the  greatest  experience  (relatively  large  s  and  long  t)  all 
operate  continuously,  while  the  remaining  six  operate  intermittently  or  on 
standby;  these  latter  display  consistently  higher  failure  rates  than  the 
former,  so  a  dummy  variable  (continuous  vs  intermittent)  regression  model 
should  tend  to  reduce  x.   In  Fig.  7.^  we  exhibit  the  result  of  a  re-analysis 
in  which  the  two  groups'  estimates  of  u  and  x  are  found  separately:   the  two 
point  estimate  vectors  are  now  much  more  consistent,  the  confidence  regions 
are  smaller,  and  only  partially  overlap. 
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8.      Simulation  Results 

Limited  simulation  experiments  have  been  carried  out   to  evaluate  some 
of  the  estimation  procedures   described.      Here  is  the  design;    see  Gaver 
(1985)    for  further  details. 


First,   the  superpopulation  form  for   the  Poisson  log  rates   e     was   taken 
for   convenience   to  be   a  member   of    the   controllably   long-tailed  Tukey  h 

family: 

2 

e.    _y  +   tz.    exp(hz.)   with  z.    _  N(0,1)    and  h,    the  tail-stretching  parameter, 

non-negative   (here  h=0.15);    see  Hoaglin   (1983),   and  Gaver    (1983)    for   details 

concerning  this  family.     We  wished  to  compare  the  treatment   of  the  different 

rate-values   in  a  random  sample  from   the  superpopulation  by  various 

estimators,   so  ordered  A -values  were  next   created   (and  stored): 

X,.v    =  exp    (e,..),    e,.,    =   u   +    tz...    exp    [h   z,..],    z,.,    being  the   i        largest 

order  statistic   in  a  sample  of   size   I    from  N(0,1).      For   h   >  0    the  extremes 

A,.,    and  X/T.    tend  to  be  outliers,   while  the  median,  1+1      ,    is 

(1)  (I)  (-2-) 

characteristic  of   a  central   value.      Second,    the  A... -values  were  used  to 

(i) 

generate   Poisson  counts,    s,.>.      Then  Stage    1    and  Stage   2   estimation 
processes   were   applied   to  estimate  first   u,    t,    and  then  the   individual    A.- 
values.      The  speedy  LGH   procedure  was   used  to  estimate  u  and   t,    and  these 
values   were  then   used   in  conjunction  with  the   approximation  A.    =   exp    (c.) 
that  solves    (5.6)    by   iteration.      Detailed  numerical    quadrature   using  the  GH 
procedure  is  perhaps  superior,   but   would  have   consumed  more  computer   time. 
The  squared  differences   of    the  estimated   A.    values   and  their  true 
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counterparts  were  then  averaged  over  S(=200)  simulations  and  quoted  as  mean- 
squared  errors  (MSE).  Table  8.1  summarizes  results  for  several  such 
experiments.  We  have  quoted  the  ordinary  units  estimate  results  as  MLE,  the 
results  of  applying  the  estimating  formulas  (5.6)  as  RS  (Restricted 
Shrinkage,  as  governed  by  w  ) ,  and  the  results  of  applying  (5.6)  without  the 
weight  as  SS  (Simple  Shrinkage);  the  latter  approximately  represents  the 
effect  of  applying  a  log-Student  model  when  n=50. 

The  results  obtained  are  suggestive  if  not  dramatic.   First,  estimates 

2 
of  the  superpopulation  mean  \i   are  nearly  unbiased,  while  those  for  x  appear 

biased  low.   Standard  errors  of  estimates  (figures  in  parentheses)  are,  not 

surprisingly,  substantial;  apparently  more  simulation  repetitions  would  be 

desirable.   Nevertheless,  comparison  of  the  MSE  figures  for  the  various 

estimators  implies  that  RS(n=iO  has  virtue:   for  the  smallest  and  largest 

rates,  A,,,  and  I,,..,  RS  estimates  resemble  MLE  performance,  while  SS  over- 
\\)  (Id) 

shrinks  and  for  the  middle  value,  U,,  RS  is  far  superior  to  the  simple 

individual,  unpooled  MLE.   The  numerical  differences  in  MSE  shown  in  the 
table  are  small  but  real  because  of  positive  correlation  between  estimate 
values  on  each  simulation  experiment. 
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Table  8. 1 

Selected  Mean  Squared  Error  Comparisons 
and  Estimated  Superpopulation  Parameters 

J-15,   h-0.15,    200   Simulations 
Student   d.f.    (tuning  constant)   n=4,50 


True 
Values 


Estimated 


Estimator  A...    (small)      X/0.    (median)    A , , _ . (large) 

v l)  (o ;  (id; 


(n-4) 


U=K).97(0.41) 
t2=0.17(0.15) 


RS 


0.016 


0.019 


0.33 


u— 1.0 


t   =0.25 


MLE 


0.007 


0.030 


0.32 


(n-50) 


u=-0.98(0.45) 


x  =0.18(0.15) 


SS 


0.019 


0.020 


0.35 


(n=4) 


u— 1.93(0.50) 


RS 


0.050 


0.0060 


0.28 


t  -0.18(0.17) 


U— 2.0 


t   =0.25 


MLE 


0.0026 


0.01  4 


0.27 


(n-50) 


u=-1.93(0.52) 


t   =0.20(0.18) 


SS 


0.005*4 


0.0057 


0.30 
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9.      Summary  and  Conclusions 

This  paper   displays   the  results  of  analyzing  several   small   batches 

(optimistically,   but  not  realistically,   random  samples)    of  event  rate  data 

as   if    (i)   parameters  of   each  batch  were  drawn  independently  from  a  fixed 

superpopulation,    and  then   (ii)   the  batches  themselves  were  samples  from 

random  processes,   here  stationary  Poisson  or   exponential-interval.      Such  is 

at  least  a  pleasant  fiction,  to  be  used  as  a  starting  point.  Computational 

methods  have  been  used  to  obtain  estimates  of  superpopulation  parameters, 

and,  from  these  pooled  or  shrunken  individualized  (log)  rate  estimates  were 

obtained.   Such  parametric  empirical  Bayes  (PEB)  analyses  have  been 

described  before  by  Hill,  et  al  (1984),  Deely  and  Lindley  (1981),  Hinde 

(1982),  Kaplan  (1983)  and  perhaps  others.  We  extend  these  by  introducing  a 

heavy-tailed  superpopulation  form,    the  logrStudent  t,    that   allows  for 

outliers  or  tail   discrepancies   incompatible  with  the  log- normal /Gauss 

description.     We  call  this  the  RPEB  setup.      The  qualitative  effect   of  such  a 

generalization  is  revealed  by  appearance  of   the  weight,   w    ,   that  selectively 

reduces   the  linear  shrinkage  towards  a  center;    see   (5.8).      Thus  w     plays  a 

role  similar  to  that   of   an  influence  function  in  robust  location  estimation 

(see  Mosteller   and  Tukey    (1971),   p.    351),    although   in  the  estimation  of   a 

single  log  rate  it  curbs  the  influence  of  the  overall  mean,  \i,    on  that 

estimate  if  the  data  give  evidence  of  extreme  discrepancy.   A  more  complete 

indication  of  the  effect  of  an  observation,  i.e.  ln(s./t.)  =  e.(r),  on  its 

ii  l 

own  shrinkage   is  given  by   the   quantity      (1/(t)2)w   /[s.    +    (1/(x)2)w   ] 
appearing   in  the  rightmost   expression  in   (5.8):      both  within   variability, 
measured  by   s. (=var[e. (r ) ])    and  between   variability,    assessed  by    (t)2,   play 
their  parts   along  with  w    .      Besides   providing   insight,    expressions   like 
(5.8)    and    (5.9)    seem   to  agree  reasonably  well   with  more   exact   results,    and 
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are  easy  to  compute,    especially   if  one  settles  for  inefficient  moment 
estimators  of  superpopulation  parameters. 

As   the  Tables   and  Figures  reveal,    the  example   data  analyses   performed 
do  not  show  enormous   differences  between  log-normal,    log-Student (5) ,   and 
gamma  superpopulation   (Bayes  prior)   specifications,    especially  for  the 
median  and  also  for  the  largest  batch  values.      The  smallest  batch 
observations   are  treated  similarly  by  gamma  and  Student(5),    with  the 
normal/Gauss   representation  tending  to  shrink  a   "small"    (zero)  value  more 
extensively  than  do  the  others   up  towards   the  center,    u,    on  the  log  scale. 
As  anticipated,    other   analyses  indicate  even  less  tail  shrinkage  by 
Student(n)   for   n<5.      Estimation  of   n  from   the  batch  values  would  be  of 
interest,    but  is  unlikely  to  be  done  with  much  accuracy  from  scanty  data. 
This  suggests   that  use  of   a  gamma  form  for   the  prior,   and  hence  for  the 
posterior  may   be  relatively  harmless.      There  is  little  evidence  in  our 
examples   that   over^shrinkage     of   the  largest  values   in  a  data  set  occurs 
when  the  gamma  specification  is   used    (although  a  small-n  Student   analysis 
could  be  performed  as   an  indication  of   the  possible  extent   of   over- 
shrinkage)  .      Certainly   the  gamma   is   technically  convenient  for   computing 
point   estimates   of   reliabilities   or   availabilities   of   complex   systems: 
integrations   can  often  be   carried  out   explicitly   as  Laplace  transforms. 

Of   considerable   interest   would  be   the  reduction  of    the   apparent   between 
variability   by  classification  or  regression,    as  briefly   illustrated  for  the 
Farley   data.      Research   in  this  area  is   currently   in  progress,    with  promising 
results.      If   part   of   the   between  variability   could   be  suitably  accounted 
for,    then   estimators   could  be   constructed   that   legitimately  pool    towards 
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appropriate  individualized  centers,    y.    rather  than   \i,    and  outliers  could  be 
explained  and  reduced  in  effect.      All  of   the  above  requires  attention  to 
collection  of  representative  current   data,   and  the  monitoring  of  analytical 
results   over  time  to  check  for   changes,   e.g.    in  basic  parameters.      Our 
present  analysis  is  only  a  step  in  this  direction.      Further  generalizations 
include  analyses  for  failure-on-demand  data,   for  which  responses  are  binary 
and  explanatory  variables  could  include  the  time  durations  between 
inspections  or  serious   activations.      Analyses   of   complex  redundant  systems 
have  been  proposed  in  Gaver  and  Lehoczky   (1985). 
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